Disputation
16 April 2024
University of Mannheim
Which methods can we use to classify data from open-ended survey questions?
Can we leverage these methods to make empirical contributions to substantive questions?
Motivation:
➡️ The increase in methods to collect natural language (e.g., smartphone surveys and voice technologies) calls for testing and validating automated methods to analyze the resulting data.
➡️ Open-ended survey answers pose a unique challenge for ML applications because they are short and lack context. Effective analysis may require suitable methods, e.g., word embeddings or structural topic models.
Figure 1: The previous question was: ‘How often can you trust the federal government in Washington to do what is right?’. Your answer was: ‘[Always; Most of the time; About half of the time; Some of the time; Never; Don’t Know]’. In your own words, please explain why you selected this answer.
Table 1. Overview of methods for classifying open-ended survey responses
| Study 1 | Study 2 | Study 3 |
|---|---|---|
| How valid are trust survey measures? New insights from open-ended probing data and supervised machine learning | Open-ended survey questions: A comparison of information content in text and audio response formats | Asking Why: Is there an Affective Component of Political Trust Ratings in Surveys? |
Landesvatter, C., & Bauer, P. C. (2024). How Valid Are Trust Survey Measures? New Insights From Open-Ended Probing Data and Supervised Machine Learning. Sociological Methods & Research, 0(0). https://doi.org/10.1177/00491241241234871
Landesvatter, C., & Bauer, P. C. (February 2024). Open-ended survey questions: A comparison of information content in text and audio response formats. Working Paper submitted to Public Opinion Quarterly.
Landesvatter, C., & Bauer, P. C. (March 2024). Asking Why: Is there an Affective Component of Political Trust Ratings in Surveys? Working Paper submitted to American Political Science Review.
Operationalization via sentiment and emotion analysis
Transcript-based
Speech-based
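As a rough illustration of transcript-based sentiment analysis, the sketch below scores a response with a small word list. The lexicon, the function name `sentiment_score`, and the scoring rule are assumptions made for this example; they are not the operationalization used in the studies, which may rely on pre-trained models instead.

```python
import re

# Hypothetical toy lexicon, for illustration only (not the studies' lexicon).
POSITIVE = {"trust", "honest", "reliable", "good", "fair"}
NEGATIVE = {"corrupt", "lie", "lies", "dishonest", "bad", "angry"}


def sentiment_score(response: str) -> float:
    """Return a score in [-1, 1]: (positive - negative) / matched words.

    Responses without any lexicon word get a neutral score of 0.0.
    """
    # Lowercase and split into simple word tokens.
    tokens = re.findall(r"[a-z']+", response.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


if __name__ == "__main__":
    print(sentiment_score("They lie and are corrupt"))
```

A dictionary approach like this is transparent but ignores negation and context, which is one reason short open-ended answers may call for embedding-based or fine-tuned models.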
Easier access to and implementation of semi-automated methods.
Large, general-purpose pre-trained models (e.g., BERT, GPT) allow less resource-intensive fine-tuning (compared to traditional supervised models).
But: these models come with a lack of transparency.
Growing possibilities for fully automated methods (e.g., prompt engineering).
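To make the idea of prompt-based classification concrete, here is a minimal sketch of how a zero-shot prompt for one open-ended response could be assembled. The function name `build_prompt`, the prompt wording, and the example categories are hypothetical; no particular model or API is assumed, and the resulting string would still need to be sent to a language model of choice.

```python
def build_prompt(response: str, categories: list[str]) -> str:
    """Assemble a zero-shot classification prompt for one survey response."""
    # List each candidate category on its own line.
    options = "\n".join(f"- {c}" for c in categories)
    return (
        "Classify the following open-ended survey response into exactly one "
        "of the categories below. Reply with the category name only.\n\n"
        f"Categories:\n{options}\n\n"
        f"Response: {response.strip()}\n"
        "Category:"
    )


if __name__ == "__main__":
    # Hypothetical category scheme, for illustration only.
    example_categories = ["competence", "integrity", "other"]
    print(build_prompt("They never keep their promises.", example_categories))
```

Keeping the prompt construction in one function makes it easy to document the exact wording used, which helps with the transparency concern noted above.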
Landesvatter: Methods for the Classification of Data from Open-Ended Questions in Surveys